MAP adaptation of stochastic grammars
Authors
Abstract
This paper investigates supervised and unsupervised adaptation of stochastic grammars, including n-gram language models and probabilistic context-free grammars (PCFGs), to a new domain. It is shown that the commonly used approaches of count merging and model interpolation are special cases of a more general maximum a posteriori (MAP) framework, which additionally allows for alternative adaptation approaches. We examine the effectiveness of different adaptation strategies and, in particular, focus on the need for supervision in the adaptation process. We show that n-gram models as well as PCFGs benefit from either supervised or unsupervised MAP adaptation in various tasks. For n-gram models, we compare the benefit of supervised adaptation with that of unsupervised adaptation on a speech recognition task with an adaptation sample of limited size (about 17 h), and show that unsupervised adaptation can obtain 51% of the 7.7% adaptation gain obtained by supervised adaptation. We also investigate the benefit of using multiple word hypotheses (in the form of a word lattice) for unsupervised adaptation on a speech recognition task for which a much larger adaptation sample was available. The use of word lattices for adaptation required the derivation of a generalization of the well-known Good-Turing estimate. Using this generalization, we derive a method that uses Monte Carlo sampling for building Katz backoff models. The adaptation results show that, for adaptation samples of limited size (several tens of hours), unsupervised adaptation on lattices gives a performance gain over using transcripts. The experimental results also show that with a very large adaptation sample (1050 h), the benefit from transcript-based adaptation matches that of lattice-based adaptation. Finally, we show that PCFG domain adaptation using the MAP framework provides gains in F-measure accuracy on a parsing task similar to the ASR accuracy improvements seen with n-gram adaptation. Experimental results show that unsupervised adaptation provides 37% of the 10.35% gain obtained by supervised adaptation. © 2004 Elsevier Ltd. All rights reserved.
Computer Speech and Language 20 (2006) 41–68. doi:10.1016/j.csl.2004.12.001
Some of the results reported here were first reported in Bacchiani and Roark (2003); Roark and Bacchiani (2003) and Riley et al. (2003). This work was done while the authors (M. Bacchiani, M. Riley, B. Roark, R. Sproat) were at AT&T Labs – Research.
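The claim that count merging and model interpolation are special cases of MAP estimation can be illustrated with a small sketch. It assumes the standard Dirichlet-prior form of the MAP estimate, in which the out-of-domain model acts as the prior mean and a weight tau sets its strength. The function names, the symbols tau, alpha, beta and lam, the toy counts, and the use of unsmoothed relative frequencies for a single fixed history are all assumptions made for illustration; the abstract gives no formulas, and models of the kind it describes would normally use smoothed backoff estimates.

```python
from collections import Counter


def map_estimate(word, c_in, c_out, tau):
    """MAP estimate of P(word | history) under a Dirichlet prior whose mean
    is the out-of-domain relative frequency and whose weight is tau."""
    n_in = sum(c_in.values())
    p_out = c_out[word] / sum(c_out.values())
    return (tau * p_out + c_in[word]) / (tau + n_in)


def count_merging(word, c_in, c_out, alpha, beta):
    """Relative frequency over the weighted sum of both count sets."""
    num = alpha * c_out[word] + beta * c_in[word]
    den = alpha * sum(c_out.values()) + beta * sum(c_in.values())
    return num / den


def interpolation(word, c_in, c_out, lam):
    """Linear interpolation of the two relative-frequency models."""
    p_out = c_out[word] / sum(c_out.values())
    p_in = c_in[word] / sum(c_in.values())
    return lam * p_out + (1.0 - lam) * p_in


# Hypothetical counts of words following one fixed history.
c_out = Counter({"the": 50, "stock": 5, "voicemail": 1})  # out-of-domain
c_in = Counter({"the": 8, "voicemail": 4})                # in-domain
n_out, n_in = sum(c_out.values()), sum(c_in.values())

alpha, beta, lam = 2.0, 3.0, 0.3
for w in ("the", "stock", "voicemail"):
    # tau proportional to the out-of-domain count gives count merging.
    merged = map_estimate(w, c_in, c_out, tau=(alpha / beta) * n_out)
    # tau proportional to the in-domain count gives model interpolation.
    mixed = map_estimate(w, c_in, c_out, tau=lam / (1.0 - lam) * n_in)
    print(w, merged, count_merging(w, c_in, c_out, alpha, beta),
          mixed, interpolation(w, c_in, c_out, lam))
```

Under these assumptions, the only difference between the two familiar methods is which count the prior weight is tied to: the out-of-domain history count for count merging, the in-domain history count for interpolation. Other choices of tau then correspond to adaptation schemes that are neither.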
Related articles
Constraint-based RMRS Construction from Shallow Grammars
We present a constraint-based syntax-semantics interface for the construction of RMRS (Robust Minimal Recursion Semantics) representations from shallow grammars. The architecture is designed to allow modular interfaces to existing shallow grammars of various depth – ranging from chunk grammars to context-free stochastic grammars. We define modular semantics construction principles in a typed fe...
Supervised and unsupervised PCFG adaptation to novel domains
This paper investigates adapting a lexicalized probabilistic context-free grammar (PCFG) to a novel domain, using maximum a posteriori (MAP) estimation. The MAP framework is general enough to include some previous model adaptation approaches, such as corpus mixing in Gildea (2001), for example. Other approaches falling within this framework are more effective. In contrast to the results in Gild...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Stochastic Categorial Grammars
Statistical methods have turned out to be quite successful in natural language processing. During the recent years, several models of stochastic grammars have been proposed, including models based on lexicalised context-free grammars [3], tree adjoining grammars [15], or dependency grammars [2, 5]. In this exploratory paper, we propose a new model of stochastic grammar, whose originality derive...
Journal title:
- Computer Speech & Language
Volume 20
Publication date: 2006